Clearly define a problem or an idea of your choice, where you would need to leverage the Foursquare location data to solve or execute. Remember that data science problems always target an audience and are meant to help a group of stakeholders solve a problem, so make sure that you explicitly describe your audience and why they would care about your problem.
Lehi, Utah, is one of the fastest-growing cities in the United States of America. The area, nicknamed Silicon Slopes, is an attractive location for companies such as Adobe, Microsoft, and Oracle, which have built campuses there. New startups are also growing rapidly in the area. Lehi is close to Salt Lake City International Airport and to Sundance, where the Sundance Film Festival is held every winter. Lehi is also known for its beautiful mountains and forests, which attract hikers and outdoor enthusiasts.
However, these advantages, and the growing number of business travelers and tourists they bring, come with one challenge: public transportation. Lehi's rapid growth has highlighted a pressing need for better public transit, which is not as readily available as in places like New York or San Francisco. A tourist wanting to visit the sites around Lehi would find it difficult to hail a taxi or catch a bus to their destination. Fortunately, ride-hailing services such as Uber and Lyft are available, and the area can be explored ahead of time using information from Google Maps or Foursquare.
As a travel agency manager, I am going to present this report to one of my clients, who will visit Lehi for a tech conference in October. This client has never visited Lehi before and plans to stay for 3 days. She is attending the conference on the first day of the trip and has already booked her hotel, Hyatt Place in Lehi. During her business trip, she wants to go hiking in the mountains or on trails to experience the fall landscape. She also plans to go shopping for souvenirs for her friends and family. I was given the draft planner and am in charge of completing it. For her to have a successful business trip, I need to plan the second and third days.
Describe the data that you will be using to solve the problem or execute your idea. Remember that you will need to use the Foursquare location data to solve the problem or execute your idea. You can absolutely use other datasets in combination with the Foursquare location data. So make sure that you provide adequate explanation and discussion, with examples, of the data that you will be using, even if it is only Foursquare location data.
import requests  # library to handle HTTP requests
import pandas as pd  # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import numpy as np  # library to handle data in a vectorized manner
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from geopy.geocoders import Nominatim  # library to convert an address into latitude and longitude values
# library for displaying HTML in the notebook
from IPython.display import HTML, display
# json_normalize is called as pd.json_normalize in this notebook, so no separate import is needed
import folium  # plotting library
# PIL is used to open images downloaded from websites
from PIL import Image
print('Libraries imported.')
draft_planner = pd.DataFrame(index=['Morning','Evening','Night'],columns=['Day1','Day2','Day3'])
draft_planner.iloc[0,0] = 'Arriving at Salt Lake City Airport and heading to the hotel'
draft_planner.iloc[1,0] = 'Participating in the IT conference'
draft_planner.iloc[2,0] = 'Return to the hotel and find a restaurant for dinner'
draft_planner.iloc[2,2] = 'Heading to the airport'
draft_planner
dfs = pd.read_html('https://en.wikipedia.org/wiki/Lehi,_Utah')
len(dfs)  # 10 DataFrames are available.
climate = dfs[2]
climate.drop(climate.index[3], inplace=True)
climate
climate_oct.rename({0:'Average high °F (°C)', 1: 'Average low °F (°C)',2:'Average precipitation inches (mm)'}, axis='index', inplace=True)
climate_oct.rename({'Oct':'October'}, axis=1,inplace=True)
climate_oct
Based on the data, the client can expect comfortable autumn weather in Lehi. However, she may need to pack a winter jacket because the temperature can drop as low as 34°F.
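The cell that creates `climate_oct` is not shown in this report. A minimal sketch of one way it could be derived, assuming the climate table has one column per month. The names ending in `_demo` are hypothetical and the numbers are placeholders, not Lehi's actual climate data:

```python
import pandas as pd

# Toy stand-in for the Wikipedia climate table; all numbers are placeholders.
climate_demo = pd.DataFrame({
    'Month': ['Average high', 'Average low', 'Average precipitation'],
    'Sep': [1.0, 2.0, 3.0],
    'Oct': [4.0, 5.0, 6.0],
    'Nov': [7.0, 8.0, 9.0],
})

# Keep only the October column; double brackets return a DataFrame, not a Series.
climate_oct_demo = climate_demo[['Oct']].copy()

# Label the rows and spell out the month, mirroring the renames used in this report.
climate_oct_demo = climate_oct_demo.rename(
    index={
        0: 'Average high °F (°C)',
        1: 'Average low °F (°C)',
        2: 'Average precipitation inches (mm)',
    },
    columns={'Oct': 'October'},
)
print(climate_oct_demo)
```

Taking a copy before renaming keeps the full climate table unchanged, so other months can still be inspected later.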
trip_lehi = pd.DataFrame({
    'Name of the place': ['Thanksgiving Point', 'Museum of Natural Curiosity', 'Museum of Ancient Life', 'Outlets at Traverse Mountain', 'Ashton Gardens', 'Hutchings Museum', "Cornbelly's Corn Maze & Pumpkin Fest", 'Glass Art Institute at Thanksgiving Point', 'Lehi Roller Mills', 'Butterfly Biosphere', 'Pointe Meadow Park'],
    'Category': ['Specialty Museums', "Children's Museums", 'Natural History Museums', 'Factory Outlets', 'Scenic Walking Areas', 'Museum', 'Farms', 'Art Galleries', 'Historic Sites', 'Science Museums', 'Parks'],
    'Hours': ['Mon - Sat 10:00 AM - 8:00 PM', 'Mon - Sat 10:00 AM - 8:00 PM', 'Mon - Sat 10:00 AM - 8:00 PM', 'Mon - Sat 10:00 AM - 9:00 PM, Sun 11:00 AM - 6:00 PM', 'Temporarily closed', 'Mon - Sat 11:00 AM - 8:30 PM', np.nan, np.nan, np.nan, np.nan, np.nan],
})
trip_lehi
# Remove the rows with NaN in the Hours column.
trip_lehi.dropna(inplace=True)
# Also remove the rows marked 'Temporarily closed' in the Hours column.
trip_lehi = trip_lehi[trip_lehi['Hours'] != 'Temporarily closed']
trip_lehi
As per the client's request, I added 'Outlets at Traverse Mountain' to the plan. Museums such as 'Thanksgiving Point', 'Museum of Ancient Life', and 'Hutchings Museum' are options I could consider.
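The Foursquare request that produces the restaurant table used next is not shown in this report. A minimal sketch of how such a search URL could be assembled, assuming the legacy Foursquare v2 `venues/search` endpoint. The helper name, credentials, coordinates, and parameter values are placeholders:

```python
from urllib.parse import urlencode

# Legacy Foursquare v2 search endpoint (an assumption about which API the report used).
SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, lat, lng, query,
                     radius=1000, limit=30, version='20180604'):
    """Return a venue-search URL around (lat, lng); hypothetical helper."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),  # latitude,longitude of the search centre
        'v': version,                    # API version date in YYYYMMDD form
        'query': query,                  # free-text search term, e.g. 'restaurant'
        'radius': radius,                # search radius in metres
        'limit': limit,                  # maximum number of venues to return
    }
    return SEARCH_ENDPOINT + '?' + urlencode(params)

# Placeholder credentials and coordinates, for illustration only.
url_demo = build_search_url('YOUR_ID', 'YOUR_SECRET', 40.0, -111.0, 'restaurant')
print(url_demo)
```

A URL built this way could then be passed to `requests.get(...).json()` and flattened with `pd.json_normalize`.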
# Remove unnecessary columns and sort the data by distance.
dataframe_food_filtered.drop(columns=['postalCode','cc','country'], inplace=True)
dataframe_food_filtered.sort_values(by='distance', axis=0, ascending=True, inplace=True)
dataframe_food_filtered
# Visualize the map:
venues_food_map = folium.Map(location=[latitude_hotel, longitude_hotel], zoom_start=12)  # generate map centred around Hyatt Place
# add a red circle marker to represent Hyatt Place
folium.CircleMarker(
    [latitude_hotel, longitude_hotel],
    radius=10,
    color='red',
    popup='Hyatt Place',
    fill=True,
    fill_color='red',
    fill_opacity=0.6
).add_to(venues_food_map)
# add the restaurants as blue circle markers
for lat, lng, label in zip(dataframe_food_filtered.lat, dataframe_food_filtered.lng, dataframe_food_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill=True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_food_map)
# display map
venues_food_map
Two restaurants near the hotel were returned: a sushi restaurant and a New American restaurant. I looked at the reviews to find the better option.
# Overall rating
try:
    print('The rating is', result_sushi['response']['venue']['rating'])
except (KeyError, TypeError):
    print('This venue has not been rated yet.')
pd.set_option('display.max_colwidth',None)
tips_df = pd.json_normalize(tips) # json normalize tips
# columns to keep
filtered_columns = ['text', 'agreeCount', 'disagreeCount', 'id', 'user.firstName', 'user.lastName', 'user.id']
tips_filtered = tips_df.loc[:, filtered_columns]
# display tips
tips_filtered.reindex()
Based on the distance, rating, and review, I decided to add the sushi restaurant to the plan.
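The rating lookup for each restaurant wraps nested dictionary keys in `try`/`except`. A minimal sketch of the same check as a reusable helper, assuming the venue-details response is nested as `response`, then `venue`, then `rating`, as in the code in this report. `get_rating` is a hypothetical name and the responses are toy data:

```python
def get_rating(result):
    """Return the venue rating from a details response, or None if it is missing."""
    venue = result.get('response', {}).get('venue', {})
    return venue.get('rating')

# Toy responses, for illustration only.
rated = {'response': {'venue': {'name': 'Example Venue', 'rating': 8.1}}}
unrated = {'response': {'venue': {'name': 'Example Venue'}}}

for result in (rated, unrated):
    rating = get_rating(result)
    if rating is None:
        print('This venue has not been rated yet.')
    else:
        print('The rating is', rating)
```

Returning `None` instead of printing inside the helper lets the same function feed a comparison table of several venues.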
# Rating:
try:
    print('The rating is', result_harvest['response']['venue']['rating'])
except (KeyError, TypeError):
    print('This venue has not been rated yet.')
pd.set_option('display.max_colwidth',None)
tips_df_harvest = pd.json_normalize(tips) # json normalize tips
# columns to keep
filtered_columns = ['text', 'agreeCount', 'disagreeCount', 'id', 'user.firstName', 'user.lastName', 'user.id']
tips_filtered_harvest = tips_df_harvest.loc[:, filtered_columns]
# display tips
tips_filtered_harvest.reindex()
Although its rating is lower than the sushi restaurant's, the reviews are not bad, so I decided to add this restaurant as an option.
results_what_to_do_sushi = requests.get(url).json()
'There are {} venues around the sushi restaurant.'.format(len(results_what_to_do_sushi['response']['groups'][0]['items']))
# Sorted the places near the sushi restaurant by distance.
df_what_to_do_sushi_filtered.sort_values(by=['distance'], ascending=True, inplace= True)
df_what_to_do_sushi_filtered
The new dataset shows that there are more restaurants near the sushi place. For example, there are the breakfast place 'The Original Pancake House' and the Mexican restaurant 'Cafe Rio Mexican Grill'. I am going to add these places as options in case the client wants to try a variety of food.
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe_shopping.columns if col.startswith('location.')] + ['id']
dataframe_shopping_filtered = dataframe_shopping.loc[:, filtered_columns]
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
# filter the category for each row
dataframe_shopping_filtered['categories'] = dataframe_shopping_filtered.apply(get_category_type, axis=1)
# clean column names by keeping only last term
dataframe_shopping_filtered.columns = [column.split('.')[-1] for column in dataframe_shopping_filtered.columns]
dataframe_shopping_filtered
# Visualize the map:
shopping_map = folium.Map(location=[latitude_hotel, longitude_hotel], zoom_start=15)  # generate map centred around Hyatt Place
# add a red circle marker to represent Hyatt Place
folium.CircleMarker(
    [latitude_hotel, longitude_hotel],
    radius=10,
    color='red',
    popup='Hyatt Place',
    fill=True,
    fill_color='red',
    fill_opacity=0.6
).add_to(shopping_map)
# add the outlet as purple circle markers
for lat, lng, label in zip(dataframe_shopping_filtered.lat, dataframe_shopping_filtered.lng, dataframe_shopping_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='purple',
        popup=label,
        fill=True,
        fill_color='purple',
        fill_opacity=0.6
    ).add_to(shopping_map)
# display map
shopping_map
The map shows that the outlet and shopping areas are close to the hotel. I am confident in adding this shopping data to the planner.
# Removed unnecessary data such as None, Playground and Travel Agency in the categories.
dataframe_hiking_filtered.drop([2,3,5], inplace=True)
# Sorted data by distance
dataframe_hiking_filtered.sort_values(by=['distance'], ascending=True,inplace=True)
dataframe_hiking_filtered.head()
# Visualize the places:
hiking_map = folium.Map(location=[latitude_hotel, longitude_hotel], zoom_start=11)  # generate map centred around Hyatt Place
# add a red circle marker to represent Hyatt Place
folium.CircleMarker(
    [latitude_hotel, longitude_hotel],
    radius=10,
    color='red',
    popup='Hyatt Place',
    fill=True,
    fill_color='red',
    fill_opacity=0.6
).add_to(hiking_map)
# add the hiking areas as green circle markers
for lat, lng, label in zip(dataframe_hiking_filtered.lat, dataframe_hiking_filtered.lng, dataframe_hiking_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill=True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(hiking_map)
# display map
hiking_map
There are three trails and one park in Lehi according to the data table. I am going to look up related webpages and pictures of each place and suggest them to the client.
Link to Lehi Rail Trail
Link to Murdock Canal Trail
# Pictures of Murdock Canal Trail
murdock_pic1 = Image.open(requests.get('https://www.utahmountainbiking.com/trails/images/pics-trails/Murdock02.jpg', stream=True).raw)
murdock_pic2 = Image.open(requests.get('https://saltproject.co/sites/default/files/images/Lindon/MurdockTrail/Murdock%20Canal%2016.jpg', stream=True).raw)
murdock_pic1
murdock_pic2
Link to Sensei Trail
# Pictures of Sensei Trail
sensei_pic1 = Image.open(requests.get('https://www.lehi-ut.gov/wp-content/uploads/2019/05/Branden-Henline-Trail-photo.jpg', stream=True).raw)
sensei_pic1
sensei_pic2 = Image.open(requests.get('http://www.utahmountainbiking.com/trails/images/pics-trails/TraverseMtn09.jpg', stream=True).raw)
sensei_pic2
sensei_pic3 = Image.open(requests.get('https://i0.wp.com/www.lehifreepress.com/wp-content/uploads/2019/05/Sensei-Trail-Image-4-provided-by-TMTA-1.jpg?fit=1256%2C595&ssl=1', stream=True).raw)
sensei_pic3
Link to Dry Creek Trail Park
# Pictures of Dry Creek Trail Park
drycreek_pic1 = Image.open(requests.get('https://utahsadventurefamily.com/wp-content/uploads/2016/03/IMG_5211-1140x855.jpg', stream=True).raw)
drycreek_pic1
drycreek_pic2 = Image.open(requests.get('https://scontent-sjc3-1.xx.fbcdn.net/v/t1.0-9/64687247_2056413144484608_296335116878217216_n.jpg?_nc_cat=105&ccb=2&_nc_sid=8bfeb9&_nc_ohc=WUSyQCQp37kAX_hMFp_&_nc_ht=scontent-sjc3-1.xx&oh=b77008fc60c1059bf8ddfdcb8050bb2b&oe=5FC47C85', stream=True).raw)
drycreek_pic2
Based on the researched data, I was able to create a map of the area around the hotel and complete a planner that the client may like.
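The maps in this report repeat the same circle-marker loop for each group of venues. A minimal sketch of how that loop could be factored into a helper, assuming folium is installed. `circle_marker_kwargs` and `add_circle_markers` are hypothetical names, not part of folium, and the coordinates are placeholders:

```python
def circle_marker_kwargs(lat, lng, label, color, radius=5):
    """Build the keyword arguments for one folium.CircleMarker."""
    return {
        'location': [lat, lng],
        'radius': radius,
        'color': color,
        'popup': label,
        'fill': True,
        'fill_color': color,
        'fill_opacity': 0.6,
    }

def add_circle_markers(the_map, lats, lngs, labels, color, radius=5):
    """Add one circle marker per (lat, lng, label) triple to the_map and return it."""
    import folium  # imported here so the helper above works without folium
    for lat, lng, label in zip(lats, lngs, labels):
        kwargs = circle_marker_kwargs(lat, lng, label, color, radius)
        folium.CircleMarker(**kwargs).add_to(the_map)
    return the_map

# Toy example of the arguments generated for a single marker.
print(circle_marker_kwargs(40.0, -111.0, 'Trail', 'green'))
```

With the DataFrames from this report, a call might look like `add_circle_markers(venues_map, dataframe_hiking_filtered.lat, dataframe_hiking_filtered.lng, dataframe_hiking_filtered.categories, 'green')`.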
venues_map = folium.Map(location=[latitude_hotel, longitude_hotel], zoom_start=15)
# red circle represents Hyatt Place.
folium.CircleMarker(
    [latitude_hotel, longitude_hotel],
    radius=10,
    color='red',
    popup='Hyatt Place',
    fill=True,
    fill_color='red',
    fill_opacity=0.6
).add_to(venues_map)
# green circles represent the trails.
for lat, lng, label in zip(dataframe_hiking_filtered.lat, dataframe_hiking_filtered.lng, dataframe_hiking_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill=True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(venues_map)
# blue circles represent the restaurants.
for lat, lng, label in zip(dataframe_food_filtered.lat, dataframe_food_filtered.lng, dataframe_food_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill=True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)
# purple circles represent the shopping areas.
for lat, lng, label in zip(dataframe_shopping_filtered.lat, dataframe_shopping_filtered.lng, dataframe_shopping_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='purple',
        popup=label,
        fill=True,
        fill_color='purple',
        fill_opacity=0.6
    ).add_to(venues_map)
# display map
venues_map
option_planner = draft_planner.copy()  # copy so that the draft planner is left unchanged
option_planner
option_planner.loc[['Night'],['Day1']] = 'Return to the hotel and have dinner at Tsunami Restaurant & Sushi Bar'
option_planner.loc[['Morning'],['Day2']] = 'Have breakfast at The Original Pancake House and go for a walk at Murdock Canal Trail - Traverse Mountain'
option_planner.loc[['Evening'],['Day2']] = 'Visit the Thanksgiving Point museum'
option_planner.loc[['Night'],['Day2']] = 'Have dinner at Cafe Rio Mexican Grill and return to the hotel'
option_planner.loc[['Morning'],['Day3']] = 'Have breakfast at Harvest Restaurant'
option_planner.loc[['Evening'],['Day3']] = 'Go shopping at Outlets at Traverse Mountain'
option_planner
As mentioned above, comfortable autumn weather is expected in Lehi, Utah. However, I am going to suggest that the client pack a winter jacket because the temperature can drop as low as 34°F.
climate_oct
When I started this project, I expected to measure travel time, but I could not find exact timings because of the limited information on public transportation. As a result, the final plan does not reflect travel time well, which may cause errors in real-time planning.
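One rough way to narrow this gap is to estimate travel time from straight-line distance. A minimal sketch, assuming the haversine great-circle formula and an assumed average driving speed of 40 km/h. Real road distances and traffic would differ, and the function names and coordinates below are placeholders:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlng = radians(lng2 - lng1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlng / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def estimate_minutes(distance_km, speed_kmh=40.0):
    """Convert a distance to minutes at an assumed average speed."""
    return distance_km / speed_kmh * 60

# Placeholder coordinates, for illustration only.
d = haversine_km(40.0, -111.0, 40.0, -111.1)
print('Distance: {:.1f} km, about {:.0f} minutes'.format(d, estimate_minutes(d)))
```

Because straight-line distance is a lower bound on road distance, estimates like this are best read as optimistic minimums when spacing activities in the planner.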
This project can provide guidelines for anyone who is planning to visit Lehi, Utah. In particular, Foursquare provides useful information for exploring unfamiliar areas around Lehi. For example, it gives distances between places and reviews, which can be helpful for a new visitor. However, despite Foursquare's usefulness, I experienced frustration gathering information. When searching for various locations, I repeatedly exceeded my daily request limit, which slowed down the project.
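One common way to stay under a daily request limit is to cache each response so that rerunning a notebook cell does not repeat the same call. A minimal sketch, assuming responses are JSON-serializable. `cached_fetch`, the cache directory, the example URL, and the stand-in fetch function are hypothetical:

```python
import hashlib
import json
import tempfile
from pathlib import Path

def cached_fetch(url, fetch, cache_dir):
    """Return the cached JSON for url, calling fetch(url) only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so it can be used as a safe file name.
    name = hashlib.sha256(url.encode('utf-8')).hexdigest() + '.json'
    path = cache_dir / name
    if path.exists():
        return json.loads(path.read_text(encoding='utf-8'))
    data = fetch(url)
    path.write_text(json.dumps(data), encoding='utf-8')
    return data

# Stand-in for a real API call; it records every URL it is asked to fetch.
calls = []

def fake_fetch(url):
    calls.append(url)
    return {'response': {'venues': []}}

demo_dir = tempfile.mkdtemp()
first = cached_fetch('https://example.com/search?q=trail', fake_fetch, demo_dir)
second = cached_fetch('https://example.com/search?q=trail', fake_fetch, demo_dir)
print('Fetch calls made:', len(calls))
```

In the notebook, `fetch` could be a function such as `lambda u: requests.get(u).json()`, so that only the first run of each query counts against the limit.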
Overall, this project taught me to use a range of tools for future data science research. I plan to build my portfolio by using and becoming familiar with these tools.
from IPython.display import HTML
HTML('''<script>
code_show=true;
function code_toggle() {
    if (code_show){
        $('div.input').hide();
    } else {
        $('div.input').show();
    }
    code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')